Stein-type covariance identities: Klaassen, Papathanasiou and Olkin-Shepp type bounds for arbitrary target distributions

Authors: Ernst, Marie; Reinert, Gesine; Swan, Yvik
Publication date: 21/12/2018

In this paper, we present a minimal formalism for Stein operators which leads to different probabilistic representations of solutions to Stein equations. These in turn provide a wide family of Stein covariance identities, which we use to revisit the classical topic of bounding the variance of functionals of random variables. Applying the Cauchy-Schwarz inequality yields first-order upper and lower Klaassen-type variance bounds. A probabilistic representation of Lagrange's identity (i.e. Cauchy-Schwarz with remainder) leads to Papathanasiou-type variance expansions of arbitrary order. A matrix Cauchy-Schwarz inequality leads to Olkin-Shepp type covariance bounds. All results hold for univariate target distributions under very weak assumptions (in particular, they hold for continuous and discrete distributions alike). Many concrete illustrations are provided.
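
As a rough illustration of the kind of statement involved, the sketch below checks the classical standard normal special case numerically: the Stein covariance identity E[X g(X)] = E[g'(X)] and the first-order bounds E[g'(X)]^2 <= Var g(X) <= E[g'(X)^2]. This is not the paper's general formalism. The test function g = sin, the helper names and the trapezoidal quadrature are all assumptions made for this example.

```python
import math

def gaussian_expectation(f, half_width=10.0, steps=4000):
    """Approximate E[f(X)] for X ~ N(0, 1) with the trapezoidal rule."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * f(x) * math.exp(-0.5 * x * x)
    return total * h / math.sqrt(2.0 * math.pi)

def gaussian_variance_bounds(g, dg):
    """Return (lower, variance, upper) for g(X), X ~ N(0, 1).

    lower = E[g'(X)]^2 and upper = E[g'(X)^2]; dg must be the derivative of g.
    """
    mean_g = gaussian_expectation(g)
    variance = gaussian_expectation(lambda x: (g(x) - mean_g) ** 2)
    lower = gaussian_expectation(dg) ** 2
    upper = gaussian_expectation(lambda x: dg(x) ** 2)
    return lower, variance, upper

# Stein covariance identity for the standard normal: E[X g(X)] = E[g'(X)].
covariance = gaussian_expectation(lambda x: x * math.sin(x))
mean_derivative = gaussian_expectation(math.cos)

# Hypothetical test function g = sin, with derivative cos.
lower, variance, upper = gaussian_variance_bounds(math.sin, math.cos)
print(covariance, mean_derivative)
print(lower, variance, upper)
```

The lower bound follows from applying Cauchy-Schwarz to the covariance identity, since Var X = 1. The upper bound is the classical Gaussian Poincaré (Chernoff) inequality.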

Oxford University Research Archive

On the rate of convergence in de Finetti's representation theorem

Authors: Mijoule, Guillaume; Peccati, Giovanni; Swan, Yvik
Publication date: 01/01/2016

A consequence of de Finetti's representation theorem is that for every infinite sequence of exchangeable 0-1 random variables $(X_k)_{k\geq1}$, there exists a probability measure $\mu$ on the Borel sets of $[0,1]$ such that $\bar X_n = n^{-1} \sum_{i=1}^n X_i$ converges weakly to $\mu$. For a wide class of probability measures $\mu$ having smooth density on $(0,1)$, we give bounds of order $1/n$ with explicit constants for the Wasserstein distance between the law of $\bar X_n$ and $\mu$. This extends a recent result by Goldstein and Reinert \cite{goldstein2013stein} regarding the distance between the scaled number of white balls drawn in a Pólya-Eggenberger urn and its limiting distribution. We also prove that, in the most general cases, the distance between the law of $\bar X_n$ and $\mu$ is bounded below by $1/n$ and above by $1/\sqrt{n}$ (up to some multiplicative constants). For every $\delta \in [1/2,1]$, we give an example of an exchangeable sequence such that this distance is of order $1/n^\delta$.
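
The $1/n$ rate can be observed numerically in a simple case. The sketch below assumes a mixing measure $\mu$ with a density on $(0,1)$, builds the law of $\bar X_n$ as a mixture of binomials, and evaluates the Wasserstein distance as the integral of the absolute difference of the two distribution functions. The function name, the midpoint grid and the uniform example are assumptions of this illustration and are not taken from the paper.

```python
import math

def wasserstein_to_mixing_measure(density, n, grid_size=2000):
    """Approximate W1 between the law of the mean of n exchangeable 0-1
    variables and their mixing measure mu, given an (unnormalised) density of mu.

    Choose grid_size as a multiple of n so that the jumps of the empirical-mean
    CDF fall on cell boundaries of the midpoint grid.
    """
    h = 1.0 / grid_size
    grid = [(j + 0.5) * h for j in range(grid_size)]
    mass = [density(p) * h for p in grid]
    total = sum(mass)
    mass = [m / total for m in mass]  # renormalise to a probability vector

    # Law of S_n = X_1 + ... + X_n: a mixture of Binomial(n, p) over p ~ mu.
    pmf = [
        sum(math.comb(n, k) * p ** k * (1.0 - p) ** (n - k) * m
            for p, m in zip(grid, mass))
        for k in range(n + 1)
    ]
    cdf_mean = []
    running = 0.0
    for value in pmf:
        running += value
        cdf_mean.append(running)

    # W1 = integral over [0, 1] of |F_n(t) - F_mu(t)|, midpoint rule.
    distance = 0.0
    cdf_mu = 0.0
    for t, m in zip(grid, mass):
        k = min(int(n * t), n)  # F_n(t) = P(S_n <= floor(n t))
        distance += abs(cdf_mean[k] - (cdf_mu + 0.5 * m)) * h
        cdf_mu += m
    return distance

# Uniform mixing measure: n * W1 stays close to a constant, i.e. W1 is of order 1/n.
for n in (10, 20, 40):
    print(n, n * wasserstein_to_mixing_measure(lambda p: 1.0, n))
```

For the uniform mixing measure the distance can be computed by hand as $(2n+1)/(6n(n+1))$, which gives a convenient check of the numerical output.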

Distances between nested densities and a measure of the impact of the prior in Bayesian statistics

Authors: Ley, Christophe; Reinert, Gesine; Swan, Yvik
Publication date: 20/10/2015

In this paper we propose tight upper and lower bounds for the Wasserstein distance between any two univariate continuous distributions with probability densities $p_1$ and $p_2$ having nested supports. These explicit bounds are expressed in terms of the derivative of the likelihood ratio $p_1/p_2$ as well as the Stein kernel $\tau_1$ of $p_1$. The method of proof relies on a new variant of Stein's method which manipulates Stein operators. We give several applications of these bounds. Our main application is in Bayesian statistics: we derive explicit data-driven bounds on the Wasserstein distance between the posterior distribution based on a given prior and the no-prior posterior based uniquely on the sampling distribution. This is the first finite sample result confirming the well-known fact that with well-identified parameters and large sample sizes, reasonable choices of prior distributions will have only minor effects on posterior inferences if the data are benign.
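
The quantity being bounded can be computed directly in a toy model. The sketch below assumes binomial data with a Beta prior, so that the posterior is a Beta distribution and the no-prior posterior is the normalised likelihood, and it evaluates the Wasserstein distance between the two by numerical integration. It does not reproduce the paper's bounds. The model, the prior parameters and all function names are assumptions chosen for illustration.

```python
import math

def wasserstein_from_log_densities(log_density_1, log_density_2, grid_size=4000):
    """Approximate W1 between two distributions on (0, 1) given their
    unnormalised log densities, via the integral of |F_1 - F_2|."""
    h = 1.0 / grid_size
    grid = [(j + 0.5) * h for j in range(grid_size)]

    def cell_probabilities(log_density):
        logs = [log_density(t) for t in grid]
        peak = max(logs)  # subtract the maximum to avoid underflow
        weights = [math.exp(v - peak) for v in logs]
        total = sum(weights)
        return [w / total for w in weights]

    probs_1 = cell_probabilities(log_density_1)
    probs_2 = cell_probabilities(log_density_2)
    cdf_1 = cdf_2 = distance = 0.0
    for q1, q2 in zip(probs_1, probs_2):
        cdf_1 += q1
        cdf_2 += q2
        distance += abs(cdf_1 - cdf_2) * h
    return distance

def beta_log_density(alpha, beta):
    """Unnormalised log density of a Beta(alpha, beta) distribution."""
    return lambda t: (alpha - 1.0) * math.log(t) + (beta - 1.0) * math.log(1.0 - t)

def prior_impact(successes, trials, prior_a, prior_b):
    """W1 between the Beta(prior_a, prior_b) posterior and the no-prior posterior."""
    failures = trials - successes
    with_prior = beta_log_density(prior_a + successes, prior_b + failures)
    without_prior = beta_log_density(1 + successes, 1 + failures)
    return wasserstein_from_log_densities(with_prior, without_prior)

# Hypothetical data with a 30% success rate and a Beta(5, 2) prior.
for trials in (10, 100, 1000):
    print(trials, prior_impact(3 * trials // 10, trials, 5.0, 2.0))
```

In this toy example the distance shrinks as the sample size grows, in line with the qualitative statement in the abstract.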

Ghent University Academic Bibliography

Crossref

Oxford University Research Archive

Open Repository and Bibliography - Luxembourg

The Adaptive Sampling Revisited

Authors: Drescher, Matthew; Louchard, Guy; Swan, Yvik
Publication date: 01/01/2019

The problem of estimating the number $n$ of distinct keys of a large collection of $N$ data is well known in computer science. A classical algorithm is adaptive sampling (AS): $n$ can be estimated by $R \cdot 2^D$, where $R$ is the final bucket (cache) size and $D$ is the final depth at the end of the process. Several new interesting questions can be asked about AS (some of them were suggested by P. Flajolet and popularized by J. Lumbroso). The distribution of $W=\log (R2^D/n)$ is known; we rederive this distribution in a simpler way. We provide new results on the moments of $D$ and $W$. We also analyze the distribution of the final cache size $R$. We consider colored keys: assume that among the $n$ distinct keys, $n_C$ have color $C$. We show how to estimate $p=\frac{n_C}{n}$. We also study colored keys with some multiplicity given by some distribution function, and we want to estimate the mean and variance of this distribution. Finally, we consider the case where neither colors nor multiplicities are known, and we want to estimate the related parameters. An appendix is devoted to the case where the hashing function provides bits with probability different from $1/2$.
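
A minimal sketch of adaptive sampling, under assumptions not stated in the abstract, might look as follows. Keys are hashed with SHA-256. A key is kept only while the $D$ low-order bits of its hash are zero. The depth $D$ is increased and the cache filtered each time the cache exceeds its capacity. The function names, the hash choice and the capacity values are hypothetical. Only the final estimate $R \cdot 2^D$ comes from the text.

```python
import hashlib

def key_hash(key):
    """Map a key to a 64-bit integer (SHA-256 is an assumption of this sketch)."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

def adaptive_sampling(keys, capacity=64):
    """Return (estimate, cache_size, depth), where estimate = R * 2^D."""
    depth = 0
    cache = set()  # hashes of the sampled distinct keys
    for key in keys:
        h = key_hash(key)
        # Keep the key only if its `depth` low-order hash bits are all zero.
        if h % (1 << depth) == 0:
            cache.add(h)  # duplicates hash identically, so they are ignored
            while len(cache) > capacity:
                # Overflow: go one level deeper and keep about half of the cache.
                depth += 1
                cache = {v for v in cache if v % (1 << depth) == 0}
    return len(cache) * 2 ** depth, len(cache), depth

# Hypothetical stream: 20000 distinct keys, each appearing twice.
stream = [f"key-{i}" for i in range(20000)] * 2
estimate, cache_size, depth = adaptive_sampling(stream, capacity=256)
print(estimate, cache_size, depth)
```

In this sketch the result depends only on the set of distinct keys, not on their order or multiplicity, because the final depth is the smallest one at which the surviving keys fit in the cache.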

Episciences.org

HAL Descartes

Hal-Diderot

On Hodges and Lehmann's "$6/\pi$ result"

Authors: Hallin, Marc; Swan, Yvik; Verdebout, Thomas
Publication date: 21/05/2013

While the asymptotic relative efficiency (ARE) of Wilcoxon rank-based tests for location and regression with respect to their parametric Student competitors can be arbitrarily large, Hodges and Lehmann (1961) have shown that the ARE of the same Wilcoxon tests with respect to their van der Waerden or normal-score counterparts is bounded from above by $6/\pi\approx 1.910$. In this paper, we revisit that result, and investigate similar bounds for statistics based on Student scores. We also consider the serial version of this ARE. More precisely, we study the ARE, under various densities, of the Spearman-Wald-Wolfowitz and Kendall rank-based autocorrelations with respect to the van der Waerden or normal-score ones used to test (ARMA) serial dependence alternatives.
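
For the location case, the bound can be explored numerically. The sketch below assumes the standard efficacy formulas for rank tests, under which the ARE of Wilcoxon with respect to normal-score tests at a density $f$ with distribution function $F$ is $12 (\int f^2)^2 / (\int f^2 / \varphi(\Phi^{-1}(F)))^2$. This expression and the helper names are assumptions of the illustration and are not quoted from the paper. Densities are passed through their density-quantile function $u \mapsto f(F^{-1}(u))$, which turns both integrals into integrals over $(0,1)$.

```python
import math
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()

def are_wilcoxon_vs_normal_score(density_quantile, grid_size=20000):
    """ARE of Wilcoxon with respect to van der Waerden tests (location model).

    density_quantile(u) must return f(F^{-1}(u)) for the underlying density f.
    """
    h = 1.0 / grid_size
    wilcoxon = 0.0      # approximates the integral of f(x)^2 dx
    normal_score = 0.0  # approximates the integral of f(x)^2 / phi(Phi^{-1}(F(x))) dx
    for j in range(grid_size):
        u = (j + 0.5) * h  # midpoint rule on (0, 1)
        value = density_quantile(u)
        wilcoxon += value * h
        normal_score += value / STANDARD_NORMAL.pdf(STANDARD_NORMAL.inv_cdf(u)) * h
    return 12.0 * wilcoxon ** 2 / normal_score ** 2

def normal_density_quantile(u):
    """f(F^{-1}(u)) for the standard normal density."""
    return STANDARD_NORMAL.pdf(STANDARD_NORMAL.inv_cdf(u))

def logistic_density_quantile(u):
    """f(F^{-1}(u)) for the standard logistic density, where f = F (1 - F)."""
    return u * (1.0 - u)

print(are_wilcoxon_vs_normal_score(normal_density_quantile))    # close to 3/pi
print(are_wilcoxon_vs_normal_score(logistic_density_quantile))  # close to pi/3
print(6.0 / math.pi)                                            # the upper bound
```

Both example values stay below $6/\pi$, as they should.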